

ISSN: 2277-3754 ISO 9001:2008 Certified International Journal of Engineering and Innovative Technology (IJEIT) Volume 1, Issue 6, June 2012

# Design and Evaluation of High Performance Multiplier Using Modified Booth Encoding Algorithm

## S. Punnaiah, G. V. Ganesh, T. Praveen Blessington, T. Krishna Karthik, Sk. Fazal Noor Basha

Abstract: Multipliers are widely used in high-performance embedded cores. The main problem with the conventional two's complement multiplier is that it requires a long computation time. In this paper, the computation time of the two's complement multiplier is reduced by decreasing the maximum height of the partial product array by one row in a radix-4 Modified Booth Encoded (MBE) multiplier. The classic two's complement n×n-bit multiplier using the radix-4 MBE scheme generates a partial product (PP) array with a maximum height of n/2+1 rows; here we reduce the maximum height of the PP array to n/2. This technique allows for faster compression of the partial product array without any increase in the delay and can be extended to higher-radix encodings. It relies mostly on circuit optimization and minimization of the critical paths. We use the Cadence RTL Compiler for the synthesis report and the ModelSim simulator for the simulation results.

*Keywords*: Multiplication, Modified Booth Encoding, Partial Product Array.

## I. INTRODUCTION

In multimedia, 3D graphics and signal processing applications, performance in most cases [1] depends strongly on the effectiveness of the hardware used for computing multiplications, since multiplication is, besides addition, used massively in these environments [2], [10]. The high interest in this application field is witnessed by the large number of algorithms and implementations of the multiplication operation that have been proposed in the literature (for a representative set of references, see [6], [9]). More specifically, short bit-width (8-16 bits) two's complement multipliers with single-cycle throughput and latency have emerged as very important building blocks for high-performance embedded processors and DSP execution cores [7], [5]. In this case, the multiplier must be highly optimized to fit within the required cycle time and power budgets. Another relevant application for short bit-width multipliers is the design of SIMD units supporting different data formats [6], [7], where short bit-width multipliers often play the role of basic building blocks.

## II. MODIFIED BOOTH ENCODING (MBE) TECHNIQUE

The basic algorithm for multiplication is based on the well-known paper-and-pencil approach [5], [6] and passes through three main phases: 1) partial product (PP) generation, 2) PP reduction, and 3) final (carry-propagated) addition [7], [12]. Booth recoding can be extended for a redundant multiplier operand [4]. While [4] starts with a signed-bit Booth recoder and adds a special preprocessing step in order to deal with a carry-save operand, the redundant Booth recoder in this work has been optimized for the carry-save representation and reduces the critical path by one XOR gate.

During PP generation, a set of rows is generated, where each row is the product of one bit of the multiplier and the multiplicand. For example, if we consider the multiplication $X \times Y$ with both $X$ and $Y$ on $n$ bits and of the form $x_{n-1} \dots x_0$ and $y_{n-1} \dots y_0$, then the $i$th row is, in general, a proper left shift of $y_i \times X$, i.e., either a string of all zeros when $y_i = 0$ or the multiplicand $X$ itself when $y_i = 1$. In this case, the number of PP rows generated during the first phase is clearly $n$.

Modified Booth Encoding (MBE) [3] is a technique that has been introduced to reduce the number of PP rows while still keeping the generation of each row both simple and fast enough [7], [9]. One of the most commonly used schemes is radix-4 MBE, for a number of reasons, the most important being that it reduces the size of the partial product array by almost half [3] and that the multiples of the multiplicand are very simple to generate. More specifically, the classic two's complement $n \times n$ bit multiplier using the radix-4 MBE scheme generates a PP array with a maximum height of $[n/2]+1$ rows [4], each row before the last one being one of the following possible values: all zeros, $\pm X$, $\pm 2X$. The last row, which is due to the negative encoding, can be kept very simple by using specific techniques integrating two's complement and sign extension prevention [6], [10]. In this work, we introduce an idea to overlap, to some extent, the PP generation and the PP reduction phases. Our aim is to produce a PP array with a maximum height of $[n/2]$ rows, which is then reduced by the compressor tree stage.
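To make the row counts concrete, the following minimal Python sketch compares plain shift-and-add PP generation with radix-4 MBE recoding. It is an arithmetic illustration only, not the circuit of this paper: the function names are hypothetical, $n$ is assumed to be even, and the multiplier is read directly as an $n$-bit two's complement value with $y_{-1} = 0$.

```python
def shift_add_rows(x, y, n):
    """One row per multiplier bit (y assumed non-negative here): n rows in total."""
    return [((y >> i) & 1) * (x << i) for i in range(n)]


def booth_radix4_digits(y, n):
    """Radix-4 MBE digits of the n-bit two's complement value y, least significant first.

    Each digit is -2*y[2i+1] + y[2i] + y[2i-1], with y[-1] = 0, so it lies in {-2, ..., 2}.
    """
    bits = y & ((1 << n) - 1)        # two's complement bit pattern of y
    digits = []
    prev = 0                         # y[-1] = 0
    for i in range(0, n, 2):
        y0 = (bits >> i) & 1         # y[2i]
        y1 = (bits >> (i + 1)) & 1   # y[2i+1]
        digits.append(-2 * y1 + y0 + prev)
        prev = y1
    return digits


def mbe_rows(x, y, n):
    """One row per radix-4 digit: n/2 rows, each equal to 0, +/-X or +/-2X, shifted."""
    return [d * x * 4**i for i, d in enumerate(booth_radix4_digits(y, n))]


if __name__ == "__main__":
    x, y, n = 11, 19, 8
    print(len(shift_add_rows(x, y, n)), sum(shift_add_rows(x, y, n)))
    print(booth_radix4_digits(y, n), len(mbe_rows(x, y, n)), sum(mbe_rows(x, y, n)))
```

In this model the MBE array has $n/2$ value rows; the additional row of the classic hardware array comes from the neg bit of the last row, which the sketch absorbs into signed integer arithmetic.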

## III. DESIGN FLOW



Fig 1: MBE Signal Generation
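As a point of reference for the signals named in Fig 1, one common textbook formulation of the radix-4 MBE control signals can be sketched as follows. This is an assumption for illustration: the exact gate network of Fig 1 is not reproduced here, conventions for the neg signal vary between designs, and the function names are hypothetical.

```python
def mbe_signals(y_hi, y_mid, y_lo):
    """Textbook one/two/neg signals for the triplet (y[2i+1], y[2i], y[2i-1])."""
    one = y_mid ^ y_lo                                     # digit magnitude is 1
    two = (y_hi & (1 - y_mid) & (1 - y_lo)) | ((1 - y_hi) & y_mid & y_lo)  # magnitude is 2
    neg = y_hi & (1 - (y_mid & y_lo))                      # digit is negative (111 maps to +0)
    return one, two, neg


def digit(one, two, neg):
    """Signed radix-4 digit selected by the three signals."""
    magnitude = one + 2 * two
    return -magnitude if neg else magnitude


if __name__ == "__main__":
    for hi in (0, 1):
        for mid in (0, 1):
            for lo in (0, 1):
                print(hi, mid, lo, mbe_signals(hi, mid, lo), digit(*mbe_signals(hi, mid, lo)))
```

Under this convention the selected digit always equals $-2y_{2i+1} + y_{2i} + y_{2i-1}$.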




### A. Explanation of design flow:

Let us take two 8-bit input vectors, A and B, with P as the output vector.

$A = 0\ 0\ 0\ 1\ 0\ 0\ 1\ 1$ (decimal 19)

Applying the MBE rules to A:

- Pad the LSB with one zero: 0 0 0 1 0 0 1 1 0
- Since n is even, pad the MSB with two zeros: 0 0 0 0 0 1 0 0 1 1 0
- Form 3-bit overlapping groups; for n = 8 we have 5 groups.
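The three rules above can be sketched in a few lines of Python. This is only an illustration of the padding and grouping steps as listed; the function name `mbe_groups` is hypothetical and the input is assumed to be a bit string with an even number of bits, MSB first.

```python
def mbe_groups(bits):
    """Pad an n-bit string (MSB first, n even) and split it into overlapping 3-bit groups.

    Returns the padded string and the groups ordered from right (least significant)
    to left. Each group reads y[2i+1] y[2i] y[2i-1] from left to right.
    """
    padded = "00" + bits + "0"       # two zeros at the MSB side, one zero at the LSB side
    groups = []
    # Step two positions at a time so that neighbouring groups share one bit.
    for end in range(len(padded), 2, -2):
        groups.append(padded[end - 3:end])
    return padded, groups


if __name__ == "__main__":
    padded, groups = mbe_groups("00010011")   # A = 19
    print(padded)
    print(groups)
```

For an 8-bit input the padded string has 11 bits, and stepping by two positions yields the 5 groups stated in the rules.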

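| A[10] | A[9] | A[8] | A[7] | A[6] | A[5] | A[4] | A[3] | A[2] | A[1] | A[0] |
|-------|------|------|------|------|------|------|------|------|------|------|
| 0     | 0    | 0    | 0    | 0    | 1    | 0    | 0    | 1    | 1    | 0    |

From right to left, the bits are assigned to the groups as follows: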

| Group      | 1 | 2 | 3 | 4 | 5 |
|------------|---|---|---|---|---|
| $y_{2i-1}$ | 0 | 1 | 0 | 0 | 0 |
| $y_{2i}$   | 1 | 0 | 1 | 0 | 0 |
| $y_{2i+1}$ | 1 | 0 | 0 | 0 | 0 |

Table 1: Partial product outputs

| Row      | PP[8] | PP[7] | PP[6] | PP[5] | PP[4] | PP[3] | PP[2] | PP[1] | PP[0] |
|----------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| pp0[8:0] | 0     | 1     | 1     | 1     | 1     | 1     | 0     | 1     | 0     |
| pp1[8:0] | 1     | 0     | 0     | 0     | 0     | 0     | 1     | 0     | 1     |
| pp2[8:0] | 1     | 0     | 0     | 0     | 0     | 0     | 1     | 0     | 1     |
| pp3[8:0] | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     |

Applying all of these bits to the MBE signal generation circuit gives the following MBE outputs:

| Group | onei | twoi | negi |
|-------|------|------|------|
| 1     | 1    | 0    | 0    |
| 2     | 1    | 0    | 0    |
| 3     | 1    | 0    | 0    |
| 4     | 1    | 0    | 0    |

Similarly, the 8-bit vector B is applied to the partial product generation circuit:

$B = 0\ 0\ 0\ 0\ 1\ 0\ 1\ 1$ (decimal 11)

Applying the MBE rules gives:

| B[9] | B[8] | B[7] | B[6] | B[5] | B[4] | B[3] | B[2] | B[1] | B[0] |
|------|------|------|------|------|------|------|------|------|------|
| 0    | 0    | 0    | 0    | 0    | 1    | 0    | 1    | 1    | 0    |

Form 2-bit overlapping groups; for n = 8 we have 9 groups.



Fig 2: Partial Product Generation

| Group     | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|-----------|---|---|---|---|---|---|---|---|---|
| $X_{j-1}$ | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| $X_j$     | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |

Applying all of these bits to the partial product generation circuit gives the following partial product outputs:

PP[00] = 0, PP[01] = 0, PP[02] = 1, ..., PP[09] = 1
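The role of the 2-bit groups $(X_j, X_{j-1})$ can be illustrated with the standard radix-4 MBE partial product bit, $pp_j = ((x_j \wedge one) \vee (x_{j-1} \wedge two)) \oplus neg$. The sketch below assumes this textbook formulation, which may differ in detail from the circuit of Fig 2; the function names, the $(n+1)$-bit row width and the sign-extension choice are assumptions made for illustration.

```python
def pp_bit(x_j, x_jm1, one, two, neg):
    """One partial product bit: select X (one) or 2X (two), then invert if neg is set."""
    return ((x_j & one) | (x_jm1 & two)) ^ neg


def pp_row(x, n, one, two, neg):
    """(n+1)-bit partial product row for an n-bit two's complement multiplicand x."""
    bits = x & ((1 << n) - 1)

    def x_bit(j):
        if j < 0:
            return 0                          # x[-1] = 0
        if j >= n:
            return (bits >> (n - 1)) & 1      # sign extension above the MSB
        return (bits >> j) & 1

    row = 0
    for j in range(n + 1):
        row |= pp_bit(x_bit(j), x_bit(j - 1), one, two, neg) << j
    return row


if __name__ == "__main__":
    print(format(pp_row(11, 8, 1, 0, 0), "09b"))   # +X
    print(format(pp_row(11, 8, 0, 1, 0), "09b"))   # +2X
    print(format(pp_row(11, 8, 1, 0, 1), "09b"))   # one's complement of X
```

When neg is set, the row holds the one's complement of the selected multiple; adding the neg bit at the least significant position of the row completes the two's complement.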

### Multiplication of A and B:

| Operand / row | Bits                                  |
|---------------|---------------------------------------|
| A             | 0 0 0 1 0 0 1 1                       |
| B             | 0 0 0 0 1 0 1 1                       |
| Row 1         | 1 1 1 1 1 0 1 0 0                     |
| Row 2         | 0 0 0 0 0 1 0 1 1 **0 1** (negi1)     |
| Row 3         | 0 0 0 0 0 1 0 1 1 0 0 0 0 (negi2)     |
| Row 4         | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| Row 5         | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |






Fig 3: Gate-level diagram of the proposed method for adding the last neg bit to the first row

The figure above shows the gate-level diagram of the proposed method for adding the last neg bit to the first row, i.e., the last negative bit is added into the first row ( $0\rightarrow 1$  or  $1\rightarrow 0$ ).
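The effect of moving the last neg bit into the first row can be checked with a small arithmetic model. The sketch below is not the gate-level circuit of Fig 3: it works on integers modulo $2^{2n}$, assumes that $n$ is even, represents every negative row by its one's complement plus a separate neg bit, and uses hypothetical function names throughout.

```python
def booth_digits(y, n):
    """Radix-4 MBE digits of the n-bit two's complement value y, least significant first."""
    bits = y & ((1 << n) - 1)
    prev, digits = 0, []
    for i in range(0, n, 2):
        y0 = (bits >> i) & 1
        y1 = (bits >> (i + 1)) & 1
        digits.append(-2 * y1 + y0 + prev)
        prev = y1
    return digits


def classic_array(x, y, n):
    """Classic PP array modulo 2**(2n): n/2 rows plus one extra row for the last neg bit."""
    mask = (1 << (2 * n)) - 1
    rows, prev_neg = [], 0
    for i, d in enumerate(booth_digits(y, n)):
        neg = 1 if d < 0 else 0
        mag = abs(d) * x                           # 0, X or 2X
        body = (~mag if neg else mag) << (2 * i)   # one's complement when negative
        if i > 0:
            body += prev_neg << (2 * (i - 1))      # neg bit of the row above fills a free low bit
        rows.append(body & mask)
        prev_neg = neg
    rows.append((prev_neg << (n - 2)) & mask)      # last neg bit sits alone in an extra row
    return rows


def reduced_array(x, y, n):
    """Fold the last neg bit into the first row so that only n/2 rows remain."""
    mask = (1 << (2 * n)) - 1
    rows = classic_array(x, y, n)
    last_neg_row = rows.pop()
    rows[0] = (rows[0] + last_neg_row) & mask      # arithmetic stand-in for the extra hardware
    return rows


if __name__ == "__main__":
    x, y, n = 11, 19, 8
    mask = (1 << (2 * n)) - 1
    print(len(classic_array(x, y, n)), sum(classic_array(x, y, n)) & mask)
    print(len(reduced_array(x, y, n)), sum(reduced_array(x, y, n)) & mask)
```

Both arrays sum to the same product modulo $2^{2n}$; the reduced array simply has one row fewer to feed into the compressor tree.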

## **IV. RESULTS**

### A. Synthesis report:

Fig. 4 shows the top view of the Booth multiplier and Fig. 5 shows its internal RTL diagram, both obtained with Xilinx 12.0. Fig. 6 shows the synthesis report of the Booth multiplier from the Cadence RTL Compiler.



Fig: 4 Booth multiplication top view (RTL) diagram



Fig: 5 Booth multiplication RTL internal diagram



Fig: 6 Synthesis report of Booth multiplication using Cadence RTL Compiler

### B. Power report

| Field                | Value                     |
|----------------------|---------------------------|
| Generated by         | Encounter(R) RTL Compiler |
| Generated on         | Feb 27 2012 04:44:15 PM   |
| Module               | boothmul                  |
| Technology library   | tsmc18 1.0                |
| Operating conditions | slow (balanced tree)      |
| Wireload mode        | enclosed                  |
| Area mode            | timing library            |

| Instance | Cells | Leakage Power (nW) | Dynamic Power (nW) | Total Power (nW) |
|----------|-------|--------------------|--------------------|------------------|
| boothmul | 139   | 158.764            | 322502.747         | 322661.510       |

### C. Simulation Results

Fig. 7 shows the simulation results of the Booth multiplication in the ModelSim simulator: the input bits are A = 00001011 and B = 00010011, and the output is 0000000010.






Fig 7: Simulation results of Booth multiplication using the ModelSim simulator

## V. CONCLUSION

With the extra hardware, the maximum height of the partial product array is reduced by one row. This reduction of the maximum height adds no extra delay to the partial product generation stage.

## ACKNOWLEDGMENTS

We thank our principal, Prof. **K. Raja Shekar Rao**, for providing the necessary facilities for carrying out this work. We acknowledge the diligent efforts of our head of the department, **Dr. Habibulla Khan**, in assisting us with the implementation of this idea.

#### REFERENCES

- [1] Eric Wharton, Dr. Karen Panetta, and Dr. Sos Agaian, "Digital electronic arithmetic with applications," IEEE Inter. Conf., 2007.
- [2] "Design and performance of pixel-level pipelined-parallel architecture for high speed wavelet-based image compression," Computers and Electrical Engineering, 2005.
- [3] G. Deng and L. W. Cahill, "Logarithmic number system and its application to image processing," Department of Electronic Engineering, La Trobe University, Bundoora, Victoria 3083, Australia, IEEE, 1993.
- [4] R. Hashemian and C.P. Chen, "A New Parallel Technique for Design of Decrement/Increment and Two's Complement Circuits," Proc. 34th Midwest Symp. Circuits and Systems, vol. 2, pp. 887-890, 1991.
- [5] J.-Y. Kang and J.-L. Gaudiot, "A Fast and Well Structured Multiplier," Proc. Euromicro Symp. Digital System Design, pp. 508-515, Sept. 2004.
- [6] M.D. Ercegovac and T. Lang, Digital Arithmetic. Morgan Kaufmann Publishers, 2003.
- [7] S.K. Hsu, S.K. Mathew, M.A. Anders, B.R. Zeydel, V.G. Oklobdzija, R.K. Krishnamurthy, and S.Y. Borkar, "A 110GOPS/W 16-Bit Multiplier and Reconfigurable PLA Loop in 90-nm CMOS," IEEE J. Solid State Circuits, vol. 41, no. 1, pp. 256-264, Jan. 2006.
- [8] I. Koren, Computer Arithmetic Algorithms. Prentice Hall, 1993.
- [9] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs. Oxford University Press, 2000.
- [10] M.D. Ercegovac and T. Lang, Digital Arithmetic. Morgan Kaufmann Publishers, 2004.
- [11] C.N. Lyu and D.W. Matula, "Redundant Binary Booth Recoding," Proc. 12th Symp. Computer Arithmetic, pp. 50-57, July 1995.
- [12] Y. Dumonteix and H. Mehrez, "A Family of Redundant Multipliers Dedicated to Fast Computation for Signal Processing," Proc. IEEE Int. Symp. Circuits and Systems, pp. 325-328, May 2000.

#### **Authors' Profiles**

Punnaiah Sangati received his B.Tech from JNTU, Kakinada, in 2010 and is pursuing his M.Tech at K L University. His interests focus on VLSI. Email: punnaiah.sp@gmail.com

Venkata Ganesh G. received his B.Tech from JNTU, Hyderabad, and his M.Tech from Andhra University in 2007 and 2009, respectively. He is working as an Assistant Professor at K L University. His interests focus on nanoelectronics. Email: ganesh.gorla@gmail.com

T. Praveen Blessington is presently working as an Associate Professor and research scholar in the Department of ECE, KL University, Guntur, Andhra Pradesh, India, where he is engaged in teaching and research in VLSI and embedded designs. He is a member of the VLSI Research Group and of the Department Curriculum Committee (DCC). His research and development interests are SoC and NoC architectures, low-power VLSI and embedded systems. He has published and presented in various reputed international and national journals and conferences. He is a life member of IETE, ISTE and SCIEI. Email: praveentblessington@kluniversity.in

T. Krishna Karthik was born in Gudivada, Krishna (dist.), AP, India. He received his B.Tech in Electronics & Communication Engineering from GEC, AP, India, and his M.Tech from KL University, Vijayawada, AP, India. He has participated in 2 international conferences and has 2 publications in IEEE. Email: mail4krishnakarthik@gmail.com

Dr. Fazal Noorbasha is presently working as an Assistant Professor in the Department of Electronics and Communication Engineering, KL University, Guntur, Andhra Pradesh, India, where he is engaged in teaching and research. He is the head of the VLSI Research Group and a member of the Department Curriculum Committee (DCC). His research and development interests are low-power VLSI, high-speed CMOS VLSI SoC, memory processor LSIs, digital image processing, embedded systems and nanotechnology. He has published and presented over 35 science and technical papers in various reputed international and national journals and conferences. He is a Scientific and Technical Committee and Editorial Review Board Member in Engineering and Applied Sciences of the World Academy of Science, Engineering and Technology (WASET), an Advisory Board Member of the International Journal of Advances in Engineering & Technology (IJAET), a Life Member of the Indian Society for Technical Education (ISTE-India), a Member of the International Association of Engineers (IAENG-China) and a Senior Member of the International Association of Computer Science and Information Technology (IACSIT-Singapore). Email: fazalnoorbasha@kluniversity.in